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ABSTRACT 

Context. The properties of dust grains, in particular their size distribution, are expected to differ from the interstellar medium to the 

high-density regions within molecular clouds. Since the extinction at near-infrared wavelengths is caused by dust, the extinction law 

in cores should depart from that found in low-density environments if the dust grains have different properties. 

Aims. We explore methods to measure the near-infrared extinction law produced by dense material in molecular cloud cores from 

photometric data. 

Methods. Using controlled sets of synthetic and semi-synthetic data, we test several methods for linear regression applied to the 

specific problem of deriving the extinction law from photometric data. We cover the parameter space appropriate to this type of 

observations. 

Results. We find that many of the common linear-regression methods produce biased results when applied to the extinction law from 

photometric colors. We propose and validate a new method, LinES, as the most reliable for this effect. We explore the use of this 

method to detect whether or not the extinction law of a given reddened population has a break at some value of extinction. 

Key words. Methods: data analysis, ISM; clouds, ISM: dust extinction. Stars: formation 



1. Introduction 

The properties of interestellar dust appear to be fairly con- 
stant througho ut the interstellar medium (ISM) of the Galaxy 
dRieke & Lebofskv 1985; Kenv on et alJ 1199^ iLombardi et alj 



I2006at LTones & Hvlandlll98(in Martin & Whittetl ll990h . reflect- 



ing the homogeneous physical conditions that characterize it. 
In the cold molecular cores, however, under lower temper- 
atures and higher densities, the dust grains are believed to 
changej_namely gro w by c oalescence and/or develop ice mantles 
(e.g., IW hittet et a llll988l;IO ssenkopf 1993; Whittet et al. 2001; 
lDrainel2003i:iRoman-Zufliga et al.ii2007i:iSteinacker et al.ii2010i) . 



Measuring these differences using methods other than detailed 
spectral analysis has proved somewhat challenging, but the ad- 
vent of larger and more sensitive telescopes has started to re- 
veal extinction laws toward these regions that depait from the 
ISM typical curves, particularly in the near- and mid-infrared 
regimeQ, putting forward the extinction law as a good indicator 
of grain properties. 

Whereas the extinction law in low-density regions and the 
ISM is well characterized by a power-law (A^ oc A~^) of index 
/3 ~ 1.8, several authors have found a pro nounced fl attening of 
the extinctio n law in high density regions. ' Lutz et al. ( 1996) and 
iLutzl (1 19991) first noted a flat extinction law toward the Galactic 
center in this wavel ength range us ing spectroscopy of hydrogen 
recombination lines. iNishivama et al. (2006) confirmed a flat ex- 



* Based on observations collected at the European Organisation for 
Astronomical Research in the Southern Hemisphere, Chile (ESO pro- 
grammes 069.C-0426 and 074.C-0728.) 

' We will refer to near-infrared as the wavelength regime from 1 to 
2.5 ^m, and the mid-infrared from 3 to 8 yum. 



tinction law toward the Galactic center usin g the colors of re d 
clump stars, later confirmed independently by Fritz et al. feOllh . 
Another example was found by .Indebetouw et al.. (.2005.) . using 
a different method based on Spitzer photometry, in the direc- 
tion of and around the star forming region RCW 49. A num- 
ber of other studies on the extinction law using Spitzer fol- 
lowed, namelvlFIahertv et al.1 (|2007|) . IChapman et al.l (l2009l) and 
iMcClurd (l2009h . who found a gray extinction law for sta r form- 
ing regions and molecular cloii ds; IChapman & Mundvl (l2009h 
and iRoman-Zuniga et a T (200/), who found a gray extinction 
law for cloud cores; and Nishivama et al. ( 2009i), who f ound a 
similar law for the Galactic center. Chapm an & Mundvl (|2009|) . 
IChapman etal] (I2009.) and .McClure. (.2009!) go a step furflier by 
analyzing the dependence of the extinction law with extinction, 
finding that the extinction law becomes grayer at higher extinc- 
tion regimes. More recently, Cambresy et al. (2011) measured 
an actual change of the extinction law within the same region 
for a threshold of Ay - 20 mag in the Tiifid Nebula. 

In this paper we address the issue of determining the extinc- 
tion law from photometric data alone and the biases inherent 
to some of the fitting methods used frequently in the literature. 
We begin by defining the mathematical problem and the possible 
methods to extract a linear fit (Sect. |2]l, validating each method 
with synthetic data (Sec.[3]l. In Sect. |4]we address the issue of 
detecting a flattening of the extinction law with extinction. 



This is the first paper in a series of two. The following paper 
(Ascenso et al., in preparation) will apply the results reported 
here to actual observations of cores in the Pipe Nebula. 
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2. The extinction law from photometric data 

2.1. Defining the problem 

The problem of deriving the extinction law from photometric 
data is one of linear regression: the goal is to determine the slope 
of the line that best fits the reddening-displaced positions of the 
stars in a color-color diagram. This slope, /3, is the ratio of two 
color excesses that compose the color-color diagram, e.g., /3 - 
E^-k/Eh-k in a a - K vs. H - K diagram, and is, in this case, 
related to the extinction law A^/A^ by: 



Aa_ ^ (Ah _ \ gj-.g 
Ak \Ak I Eh-k 



+ \ 



(1) 



Linear regression, however, is not a simple science. The pres- 
ence of errors in both coordinates, that they vary as a function 
of the quantities being analyzed (heteroscedasticity), that they 
may be correlated, and the presence of intrinsic scatter, make 
the required linear regression analysis far more complex than 
the typical chi-squared minimization of residuals. Problems of 
this nature, in particular applied to astronomical analysis, have 
been debated in the liter ature for over 50 years dSearesl[l944t 
iTrumpler & Weaverlll953D . although every specific case seems 
to require a careful consideration of the methods to use. 

The particular distribution to which we presently wish to fit 
a linear function has the following characteristics: 

- The X and Y variables are photometric colors, obtained by 
subtracting the magnitudes of a star in two bands. Both vari- 
ables are therefore subject to photometric errors that increase 
with magnitude but not in an entirely predictable way with 
color. 

- Because we observe (parts of) a non-homogeneous cloud, 
the amount of extinction is not the same for all stars, and 
since most of the area is at low extinction, the distribution 
of points in a color-color diagram is denser in the bluer end, 
and scarcer in the redder end. 

- The X and Y colors will usually have one band in common 
{e.g., J - H and H - K), which causes the errors to be corre- 
lated (anti-correlated in the example). 

- Because we observe the colors of a random sample of stars 
background to the cloud, the data will have intrinsic scatter 
caused by the range in spectral types of the stars. The intrin- 
sic scatter is unrelated to extinction. 

To summarize, the position of each datapoint in the color- 
color diagram is determined by three factors: the intrinsic scatter 
from the dispersion in spectral types of field stars, the reddening 
caused by dust extinction, and the measurement (photometric) 
error. The first alone would trace the loci occupied by the unred- 
dened colors of the population of stars in the field, mostly giant 
and main-sequence stars; the eff'ect of extinction is to (dim the 
objects and) move each point along the line whose slope we want 
to determine; and the photometric errors scatter the points in an 
ellipse around the intrinsic, reddened position (not a circle be- 
cause the errors are correlated). A proper method to measure the 
slope of this distribution should be robust enough to disentangle 
these eff'ects and return the single slope of the reddening vector. 



2.2. Methods for linear regression 

In this paper we test the following methods for linear regression 
applied to the problem described in the previous section. 



2.2.1. Least squares fitting 

The ordinary least-squares (OLS) fit is the simplest approach to 
linear regression. It works by minimizing the vertical distance of 
all data points to successive lines in the 2-D (slope and intercept) 
parameter space. Formally, it is equivalent to finding apis and 
/3oLS that minimize the quantity: 



X^(aoLS,/3oLs) = Y^iji - aoLS - /30LSXif 



(2) 



i=l 



where (x,-,}/,) is the /''' data point, and a and/3 are the intercept 
and the slope one is trying to find. This method treats all data 
points the same, even though the position of some points will be 
more uncertain than that of others. The knowledge of the mea- 
surement errors for each data point can be used to attribute dif- 
ferent weights to the different data points, thus optimizing the fit, 
which is equivalent to finding owls and /3wls that minimize the 
quantity: 



X' 



'{awLS^PwLs) - / ^ 



{ji - awLS - PwLsXif 



,=i o"y,+/3ksO"x; 



(3) 



where o-f, and cr,,,- are the measurement errors of the /''' data 
point. This method is called the weighed least squares (WLS) 
method. 



2.2.2. Symmetrical metliods 

Applications of the least squares method have been suggested to 
be most appropriate when the intrinsic scatter in th e data domi- 
nates over the measurement errors (see llsobe et all (11990) for a 
detailed discussion and formal description). As opposed to the 
method described above, these treat both variables symmetri- 
cally, in such a way that it is no longer meaningful to speak of 
dependent and independent variables. 

Two of these methods are based on calculating the OLS slope 
of the Y vs. X distribution 0i - OLS(Y|X) following the no- 
tation of llsobe et alJ (Il990h . and that of the X vs. Y distribu- 
tion, 02 - l/OLS(X|Y). The best fit is the Hne that bisects the 
OLS(Y|X) and the OLS(X| Y) in the bisector method (eq. lU, or 
the geometric mean of the OLS(Y|X) and OLS(X|Y) slopes in 
the geometric mean method (eq. |5]l. A third method, orthogo- 
nal regression (eq. |6), minimizes the distances of the points to a 
model line, but perpendicularly t o the line instead of vertically as 
the regular least-squares method, llsobe et all (1 19901) clearly state 
that these three methods do not produce the same or equivalent 
solutions. The plus or minus in the equations below refer to the 
sign of the covariance of the two variables — positive if they are 
correlated, negative if anti-correlated. 



Py02 - 1 + ^il+P\)i\+Pl) 
Pbisector ~ ^ ^ \^) 



A 
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h-K±^^ + ih-K)' 
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(5) 



(6) 
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2.2.3. Binning 

Binning data always causes loss of information, but, sometimes, 
what is lost in information, is gained in simplicity of analysis 
and results. In their studies of extinction by molecular clouds, 
ILombardi et al.l (l2006b) developed a method to determine the 
extinction law that consisted in binning the colors in both axes 
and fitting the bins using the weighed least-squares fitting de- 
scribed in sect. 12.2.11 The characterization of the data changes 
entirely, because (1) the data points are no longer measurements 
but the weighed average of many measurements, (2) the eiTors 
are no longer measurement errors but the dispersion of colors 
within each bin, and (3) the fewer high extinction points are 
given more weight than before as they are represented by a bin 
with the same weight as those bins containing more points. The 
latter point implies, although with the potential of introducing 
small number statistics issues, that the weight of any extinction 
range is the same, regardless of where the majority of datapoints 
lie. In this way, instead of the fit being weighed by the more 
abundant points at low extinction, as is the case for the fit to all 
datapoints, it is equally weighed by a much larger range of ex- 
tinction thus allowing a better constrain of the extinction law. 
It also implies that part of the intrinsic scatter problem is elim- 
inated as the high extinction population will be dominated by 
giants whose range in intrinsic colors is much narrower. 

For the purpose of these experiments we have binned the data 
in two ways: (1) along the color in the X-axis in bins of color, 
and (2) iteratively along an assumed reddening vector in bins of 
Ay following the method described bv lLombardi et al.l (l2006bl) . 



2.2.4. BCES metliod 



The standard BCES method (lAkritas & Bershadvl[T996l) starts 
from the simple observation that the slope yS that minimizes the 
standard OLS Eq. (1) can be alternatively written as 



ySc 



Cov(x, y) 
Var(x) 



(7) 



where Cov(x, y) is the observed covaiiance between the data {x,) 
and {yi}, and Var(x) is the variance of {x,). These two quantities 
can be evaluated from the usual equations 



1 " 
Var(x) = - ^(x,- - xf , (8) 

;=0 

Cov(x)=i J(x,-x)0;,-y), (9) 



where the bar indicates the average values. Note that for the spe- 
cific purposes of the BCES method the variance and covariance 
of the data should be evaluated using A^ in the denominator (in- 
stead of the more common N -I, used when estimating the vari- 
ance and the mean from the same dataset). 

The equation for /? above suggests that we can easily take 
into account the presence of measurement eiTors on both x and 
y by simply subtracting their effect from the estimates of the 
covariance and variance. To this purpose, suppose that each point 
(xi,yi) in our dataset is affected by a statistical error, so that the 
measured values (x,-,}),) = (x/,y/)-i-(e:^, e^). The quantities (e:*, e^) 
represent the eiTors, and are drawn from some distribution; the 
eiTors on x and y are not assumed to be uncorrelated here. The 
presence of the errors changes the covariance and variance in the 
expression of/? above as follows 



Covix,y) H^ Cov(x,y) + CovCe"*, e^) , 
Var(x) H^ Var(x) + Yar(e'') , 



(10) 
(11) 



where we have introduced the variance and covariance of the er- 
rors measurements. Since the true, original variance and covari- 
ance is relevant for the estimate of y6,the equations above suggest 
that in presence of measurement errors, we can replace the equa- 
tion above with 



/^B 



Cov(x,y)-(Cov)(e\ey) 
Var(x) - (yflr)(e^) 



(12) 



Note that, in contrast to (x, y), the variance and covariance of 
the measurement errors is assumed to be known and cannot be 
derived from the data alone. For example, in our case, if {x,y) 
are two colors, say x - H - K and y = J - H,we will have 



Cov(e^ e>') = -al , 
Var(e^) = cr^ + 4 . 



(13) 
(14) 



where cth and ctk are the mean photometric error in the H and 
K bands, respectively. 

2.2.5. A new metliod for linear regression: LinES 

The standard BCES methods is very simple to implement, but, 
although, in principle, it should be able to work in presence of an 
intrinsic scatter, in practice, at least in the original version pre- 
sented bv lAkritas & Bershadvl(ll996l) . it is does not. The authors 
do mention the possibility of accounting for intrinsic scatter in 
the data, but only when x is measured without eiTor, which is 
not applicable to the case of the extinction law. Additionally, in 
the situations that they consider, the amount of intrinsic scatter 
is taken to be unknown. In our case, however, we can estimate 
this quantity directly from a control field with no extinction: the 
variance and covariance of the colors there is just the sum of the 
variance and covariance due to the intrinsic scatter in color of 
stars, and by the photometric errors for the control field. This 
way we can derive the slope using the following expression: 



A 



LinES 



Cov(x,y) - Cov(e-'", e>') - Cov(x''^, y"-^) + Cow{e'f\ e'fy) 



Var(x) - Var(e-^) - Var(x''/) + Var(e''A) 



(15) 



We call this estimate the LinES method (Linear regression 
with Errors and Scatterfl As shown in section U.4.71 this method 
is robust against the presence of correlated errors in both vari- 
ables, and against the presence of intrinsic scatter. 



3. Method validation 

We tested the methods described above using controlled sets of 
synthetic data. The following sections discuss these data, and the 
effects of varying the data parameters on the results produced by 
each method. 



A code for fitting data with LinES will be available online. 



J. Ascenso et al.: The extinction law from photometric data: methods 




J-band magnitude 

Fig. 1. Observed 7-band luminosity function of a control field, 
used as a model distribution for the unreddened J luminosities 
of the synthetic data. 



the median extinction of the Pipe, and with a width of cr = 0.46, 
chosen so that the number of stars at high extinctions approxi- 
mately matches that for the observed data of the Pipe cores (Fig. 

m. 



PDF 



V2^ 



(16) 



Because the probability distribution function (PDF) for the ex- 
tinction is a lognormal, there will be many stars at low extinction 
and progressively fewer stars at high extinction. 

Ak was then calculat ed from the relation AkIAv - 0.112 
(iRieke & Lebofskvlll985h and the values of extinction drawn 
from eq. ( fTSI l were applied to the 7, H and K brightnesses us- 
ing the extinction law characterized by; 



AhIAk^ 1.55 
AjlAK^l.55{Ji+\)-l3. 



(17) 
(18) 



The second equation, derived from eq. ([Q for a (J-H) vs. (H-K) 
diagram, defines /? = E(J - H)/E(H - K), which means the 
reddening vector has a slope of y6 in a (7 - H) vs. (H - K) color- 
color d iagram. Ah/Ak - 1.55 is adopted from llndebetouw et al.l 
(l2005l) . 




60 
A, [mag] 

Fig. 2. Log-normal model distribution of extinction Ay for a 
generic cloud. The dotted line corresponds to Ay - 2.5 mag, 
the median extinction of the Pipe Nebula. 



3.1. Synthetic data 

We constructed a set of simulations meant as controlled, realis- 
tic datasets c omparable to the observed data for the Pipe Nebula 
cores (.Lombardi et al.ll2006bt lAlves et aDl2007') . The simulated 
data consisted of a number of points (stars) characterized by 
brightnesses in three bands - arbitrarily J, H and K - aff'ected 
by a value of extinction that obeys a predetermined extinction 
law. We simulated three sets of data to test the different aspects 
of each method. 

For all sets, the J brightnesses were drawn randomly from 
the distribution of / luminosities from an observed unreddened 
control field toward the galactic bulge (Fig. [1}. 

The synthetic extinction profile was defined as a lognormal 
distribution (Eq. ( fTSI l) centered at // = log(2.5), the logarithm of 



3.1 .1 . Set 1 : Homoscedastic data, no intrinsic scatter 

For the first set, the H and K brightnesses were derived from J 
assuming that all stars have t he typical int rinsic color of giants 
(J-H = 0.7 and H-K = 0.15, iBessell & B rett 19880. Each star 
was then assigned a value of extinction drawn randomly from the 
extinction profile (eqs. (fTSI l - (fTST l. Fig.|2]l. 

In addition to reddening, each star was assigned eiTors in 
J, H and K, to simulate the photometric errors inherent to real 
data. The errors were first applied in the simplest possible way: 
independently of brightness. The magnitudes of the errors were 
drawn randomly and independently for J, H and K, from a nor- 
mal distribution with yU = and cr = 0.05, so that 95% of the 
stars will have errors below 0. 1 mag, the typical acceptable er- 
rors for photometry. A random value from these distributions 
was then added to the magnitudes of each star in all bands inde- 
pendently. The synthetic data produced in this way will hence- 
forth be refeiTed to as Set 1, or homoscedastic dataset, since the 
errors do not scale with the variables. 



3.1 .2. Set 2: Heteroscedastic data, no intrinsic scatter 

A second approach was designed to reproduce the fact that the 
error in real observations does in fact increase with magnitude. 
We modeled this dependence using an error distribution in the 
form of a power-law S(m) = Cm-\ where S is the typical eiTor 
associated with magnitude m (Fig. [3]). The normalization con- 
stant C was set so that 90% of the 25''' magnitude stars have an 
error up to 0.3 mag, and the index x, that defines how rapidly 
the errors increase with magnitude, was set to 4. Both parame- 
ters were empirically chosen to produce a curve that resembles 
a typical eiTor distribution of the NIR data, but other combina- 
tions around these values would also be good representations of 
the general distribution of photometric errors in a sample, and do 
not change the results. This function was then used to determine 
the width of the Gaussian from which the errors for each star 
were drawn according to its magnitude, the etTor for the bright 



The exact value of these colors is not relevant for the tests. 
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Fig. 3. Width of the error distribution as a function of magnitude 
used to generate the "photometric errors". The insets show the 
actual error distribution for stars of magnitude 15 (left) and 25 
(right). 



stars being drawn from a narrow Gaussian, and that for the faint 
stars being drawn from a wider Gaussian (see the insets in Fig. 
[3]l. The synthetic data produced in this way will hereafter be re- 
ferred to as Set 2, or heteroscedastic data, since the errors do 
scale with magnitude. 

Figure |4] shows the construction of the synthetic data in the 
color-color diagrams for a given realization of heteroscedastic 
data without intrinsic scatter 



3.1 .3. Set 3: Heteroscedastic data with intrinsic scatter 

The final experiment was meant to represent the fact that the 
stars being observed through the cloud do not have a single color, 
but are rather distributed along the main-sequence and giant stars 
loci according to their individual masses and ages. This intrin- 
sic "scatter" about a single colour was modeled by assuming a 
fraction / of main-sequence-to-giant stars, and randomly popu- 
lating the loci accordingly, regardless of the stars' brightnesses. 
This does not produce a completely realistic set of data because 
the position of the stars would depend on their brightness (see 
Sect. 13.2b . but our goal is to test how the presence of a generic, 
non-symmetrical intrinsic scatter affects the measurement of the 
slope of the extinction law. 

In this dataset there are then three contributions to the dis- 
tribution of points in the color-color diagram: the distribution 
of intrinsic colors of stars background to the cloud, that spread 
the stars along the main-sequence and giant loci; the extinction, 
that moves each point from its intrinsic position along a line 
whose slope is determined by the properties of the dust; and the 
magnitude-dependent photometric error, that, in a color-color di- 
agram, is equivalent to each point being drawn from an ellipse 
around each intrinsic and reddened point, whose dimensions de- 
pend on the observed brightness of the corresponding star 

The data produced in this way will be referred to Set 3, or 
heteroscedastic data with intrinsic scatter 



3.1 .4. Syntlietic control fields 

For each set intended to pose as science data, we generated a 
corresponding set intended to pose as data from a control field. 
For Sets 1 and 2 the synthetic control fields had equivalent ho- 
moscedastic or heteroscedastic errors, respectively, drawn ran- 
domly but independently from the same distributions as the er- 
rors for the synthetic "science" fields. For Set 3, apart from the 
heteroscedastic errors, the control field simulation also included 
an amount of intrinsic scatter equivalent to that of the synthetic 
"science" field. Neither of the synthetic control fields included 
extinction, as they are meant to represent the population back- 
ground to the cloud that is causing the reddening on the "sci- 
ence" field stars. 



3.2. "Real" data 

There is one aspect of real observations that we cannot test with 
the synthetic datasets described above: whereas at low extinction 
both main-sequence and giants can be observed, at high extinc- 
tion the main-sequence stars are more efficiently dimmed below 
the detection limit, leaving a population dominated by the intrin- 
sically brighter giants at redder colors. In our definitions for the 
synthetic data, this means that / should change with extinction, 
being larger at low extinctions and progressively smaller at high 
extinctions. As akeady hinted in Sect. 13.1.31 in set 3 of our syn- 
thetic data we do simulate intrinsic scatter, but the brightness of 
each star does not scale with its spectral type, which translates 
into having a constant / throughout the entire extinction range. 
To test the effect of a varying amount of intrinsic scatter with ex- 
tinction on the methods we applied them to actual observations 
of control fields (courtesy of C. Rdman-Zuniga), which we red- 
dened in the same way as we did the synthetic data. We refer to 
these datasets as "real", keeping the quotes to make clear that 
the actual observed data were then artificially modified. 

The first dataset contains data taken with the SOFI instru- 
ment at ESO's New Technology Telescope, in the direction of 
the galactic disk (10''38'"12', -59°12'02", J2000.0), on the night 
of March 31", 2006. This dataset, which we refer to as "disk 
dataset", contains 548 stars with PSF photometry in the J, H 
and Ks filters. This is not the ideal dataset for two reasons: first, 
it contains few stars, and second, since this is a field in the galac- 
tic disk, it already has some extinction. 

The second dataset contains data also taken with the SOFI 
instrument at ESO's New Technology Telescope on the night 
of June 22""', 2002, but in the direction of the galactic bulge 
(17''08'"10', -28°03'03", J2000.0). This dataset, which we refer 
to as "bulge dataset", contains 1071 stars with PSF photometry 
in the J, H and K^ filters. 

5000 "science" subsets were drawn randomly from each of 
these datasets, and extinction was applied to each star from the 
extinction distribution (see Eq. ([TJ, Figure |2]l. Similarly, 5000 
"control field" subsets were drawn randomly from each dataset, 
and used as they were. 450 and 800 stars were drawn from the 
disk and the bulge datasets, respectively, in order to keep the 
possibility of the "control" and "science" subsets being made of 
different stars. 

For the effects of these experiments, the two sets differ in 
three aspects: the bulge dataset contains more stars, does not 
have extinction, and is made up mostly of giant stars, whereas 
the disk dataset contains fewer stars, has akeady some extinc- 
tion, and is more likely to have a higher fraction of main- 
sequence objects. The (J - H) vs. (H - K) color-color diagrams 
of the two datasets are shown in Figures |5] and |6] and illustrates 
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Fig. 4. Example of the generation of synthetic heteroscedastic data without scatter. From left to right: all synthetic stars have the 
same intrinsic color; they are reddened according to the log -normal distribution of Ay; photometric error that depends on magnitude 
is added; and finally a magnitude cut is implemented, reducing the size of the sample and the coverage in Ay. 



these differences. In both figures, the left panels are the original 
datasets, and the middle and right panels show one realization of 
the extracted "control field" and "science" subsets used for the 
tests, respectively. 

The only caveat regarding these datasets is that, when red- 
dening the stars to pose as science subsets, their magnitudes 
change, so their associated errors should change accordingly. 
However, since these are real observations and given the sta- 
tistical nature of the errors, that adjustment is not possible. In 
practice, this means that there will be stars at high extinction 
with an underestimated associated error, but for the effect of our 
tests this is not critical, since there continues to be no clear de- 
pendence of the error with extinction, as is the case in real data 
subject to extinction. 

3.3. Parameters 

To test the robustness of the methods, the synthetic datasets were 
generated using a range of parameters, namely input slopes of 
the reddening vector, amount of intrinsic scatter, size of the sam- 
ple, and other specific parameters only relevant to some of the 
methods. 



Input slope Each set was generated with seven values of input 
slope p in the range [- 1 .0, 3 .0] to cover the range expected for an 
extinction law in the near- and mid-infrared. The methods were 
tested under ideal conditions of number of stars and Ay cover- 
age to test only the ability of the methods to deal with different 
values of /3. The input slope was varied to guarantee that our 
conclusions are not only valid for one specific value of yS. While 
varying the remaining parameters the input slope was fixed at 
1.8. 



Magnitude limit The magnitude limit was parameterized by me- 
lt corresponds to setting a brightness limit in real data, and it 
was applied identically in J, H and K such that all the synthetic 
stars fainter than nic in any band after applying the extinction, 
are discarded from the fit (see rightmost panel in Fig. 2). A mag- 
nitude limit is naturally set in real data (detection limit), but is 
also something one might consider doing artificially to eUminate 
those stars with the largest photometric errors. 



Table 1. Effective number of stars as a function of magnitude 

cut. 



m. 


Ns 


Nf 
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(H-K)„,„^ 


25 


5000 


4929 


0.99 


2.32 


23 


5000 


4850 


0.97 


1.98 


21 


5000 


4550 


0.91 


1.65 


19 


5000 


2930 


0.59 


1.35 


17 


5000 


680 


0.14 


1.04 



Decreasing nic is equivalent to reducing the size of the sam- 
ple, both in number and in range of extinction (the stars at high 
extinction will likely be fainter), while simultaneously reducing 
the range of errors in Sets 2 and 3 (fainter stars will have larger 
errors). By construction, a value of nic = 25 corresponds to al- 
lowing the largest errors to be 0.3 mag (see sect. 13.1.2) . 

Table \T\ shows the averagqj number of stars from Set 3 that 
survive each magnitude cut (effective number of stars, A^^") 
from an initial sample of Ns - 5000 synthetic stars. The ratio 

fsci - N'^s /^s reflects the functional form of the model distribu- 
tions and, as such, would be the same for any other input num- 
ber of stars Ns for each magnitude cut. The table also shows 
the average H-K color of the most heavily reddened datapoint 
attained for each magnitude cut, illustrating the loss in Ay cov- 
erage with magnitude limitation. The same magnitude cuts were 
applied to the control field, but because the control field does not 
have extinction, the effect of the cut in the effective number of 
stars is much more subtle. The ratio of effective to initial number 
of stars for the control field is 1 .00 for nic < 21 mag, and drops 
to 0.90 and 0.25 for nic of 19 mag and 17 mag, respectively. 



Number of stars Generating fewer stars in the first place also 
changes the size of the sample. The (subtle) difference with re- 
spect to implementing magnitude cuts is that the sample with 
fewer stars and no magnitude cut will most likely have a broader 
range of Ay than would a richer sample with magnitude cut, as 
some of the fewer stars that are generated and kept may still be 
faint and heavily extincted from the random drawing process. In 
real observations, generating fewer stars without imposing mag- 



Average obtained from 5000 realizations. 
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Fig. 5. (J - H) vs. (H - K) color-color diagrams for the disk dataset. Left: all stars in the dataset. Middle: 450 stars chosen randomly 
from the dataset, representing the "control field". Right: 450 stars chosen randomly from the dataset, and reddened to represent the 
"science field". 



2.5 



2.0 



T^ 1-5 

X 

^ 1.0 



0.5 



0.0 



originaljauige dataset 




"control field" subset 




"science" subset 




1.0 



1.5 0.0 



0.5 

H-K [mag] 



1.0 



1.5 0.0 



0.5 



1.0 



1.5 



Fig. 6. (7 - H) vs. (H - K) color-color diagrams for the bulge dataset. Left: all stars in the dataset. Middle: 800 stars chosen randomly 
from the dataset, representing the "control field". Right: 800 stars chosen randomly from the dataset, and reddened to represent the 
"science field". 



nitude cuts would be comparable to observing a sparse region 
of the sky where the faint and highly reddened stars can be de- 
tected. Imposing a magnitude limit, on the other hand, would 
correspond to having shallow observations regardless of the rich- 
ness of the observed field; in this case, the fainter and more red- 
dened stars would not be detected. 

We tested the methods against varying number of stars in the 
range [100, 5000], both in the science and the control field. 



Amount of intrinsic scatter The amount of intrinsic scatter for 
Set 3 was parametrized by /, the fraction of stars in the main- 
sequence locus with respect to those in the giant locus, taking 
values of 0.01,0.15 and 0.50. Given the shape of the loci, a larger 



scatter is obtained for larger values of /, although the giant locus 
alone still produces some intrinsic scatter. Larger values of / 
would correspond to a large fraction of the stars behind the cloud 
being main-sequence in real data, as opposed to having mainly 
giants. When varying the remaining parameters, / is fixed at 0. 15 
for Set 3 (/ is not a parameter in Sets 1 and 2). 



Number of control-field stars The synthetic control field is only 
used for the LinES method, as the other methods rely solely on 
the science data. Unless explicitly stated, the control field was 
generated with the same number of stars and was subject to the 
same magnitude cuts as the synthetic "science" datasets. This 
means that, after applying a given magnitude cut, there will ef- 
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fectively be more stars in the control field sample than in the sci- 
ence sample, since a fraction of the stars in the science dataset 
will have been dimmed below the magnitude cut by the effect of 
extinction. This mimics real observations in that the control field 
should be obtained from a region of comparable background 
stellar density (same number of stars generated in the simula- 
tions) and using the same instrumental setup (same magnitude 
cut in the simulations) as the science field. To test the effect of a 
less than ideal control field, we tested the LinES method against 
varying numbers of stars also in the control-field datasets. 

3.4. Results from synthetic data 

Each method was applied to 5000 realizations of the synthetic 
data, producing an average value p, and a dispersion 0-;^ around 
the average for each parameter within sets 1, 2, and 3. We define 
bias b as the difference between y6 and the input value of the 
slope {b - fi- jS), and consider a method unbiased if it produces 
an estimate within the 1-cr dispersion of the input value {i.e., 
b < a-p). 

We note that the quoted absolute values of the biases are for- 
mally only valid for the conditions of these simulations, namely 
for the magnitude of the error and the functional form of the 
error distribution with magnitude. 

The results are summarized below. Figures |7] through [13] 
show the bias as a function of varied parameters. 

3.4.1. Ordinary least-squares 

The OLS method has long been known to be biased when there 
are errors in both coordinates. This was observed also in our 
simulations (Fig.|7]i, with the method failing to recover the right 
value of the slope in the majority of the tests performed. It did 
provide an unbiased result for the following combinations of 
input slopes and datasets: yS = -0.5 for Set 1, jS = -1.0 and 
j6 = -0.5 for Set 2, andjS = 1.0 for Set 3, making this method 
unsuitable if one is trying to find precisely y6. 

This method is also not robust against variations in ntc, the 
bias and the dispersion both increasing for brighter ntc in the 
heteroscedastic sample, except for the brightest magnitude cut 
considered, where the bias is suddenly reduced. It also reacts, 
although to a lesser extent, to changes in the amount of intrinsic 
scatter in the heteroscedastic sample (Set 3), the bias increasing 
with /. 

This method is extremely robust against variations in the 
number of stars within the same magnitude cut, although the 
dispersion in the slope increases for fewer stars as would be ex- 
pected from poor statistics, and the absolute value is biased. 

Overall, this method is not a reliable estimator of the extinc- 
tion law. 



3.4.2. Weighed least-squares 

The WLS method performs well in homoscedastic or het- 
eroscedastic data without intrinsic scatter (Fig. |8), suggesting 
that, under these conditions, the method can deal properly with 
the presence of errors in both coordinates, and with them be- 
ing correlated. For these samples the dispersion in the slope is 
remarkably small, making it a very accurate estimator. In the 
presence of intrinsic scatter, however, the method systematically 
fails to recover the input slope whatever its value in the range 
[-1.0,3.0], although it does come close around/? - 2.5. The 
bias as a function of input slope plot for this method and dataset 



(Fig. [8] blue line) suggests that there could be another unbiased 
value of j8 between -0.5 and 0.5, but tests suggest that there is 
instead a discontinuity around/? = 0. 

Although biased for Set 3, this method is robust against vari- 
ations in the number of stars to within 1.5% in the considered 
range, but the dispersion increases steadily for fewer stars. 

Biased in the presence of even a small intrinsic scatter, this 
method is not a rehable estimator of the extinction law. 

3.4.3. Symmetrical methods 

The three symmetrical methods returned biased results for all 
tests (Fig.|9]l, except for input slopes of -1.0 and 1.0 in datasets 
1 and 2 (without intrinsic scatter). The bisector and geometric 
mean methods produce very similar results. The orthogonal re- 
gression method presents the largest biases of the three. 

Besides being biased, neither method is robust against vari- 
ations in ;«c or / in the presence of intrinsic scatter, the bias 
increasing for bright magnitude cuts and more intrinsic scatter. 
The methods are highly robust to variations in the number of 
stars, the dispersion increasing steadily for fewer stars. 

These methods are not reliable estimators of the extinction 
law. 



3.4.4. Binning \n{H-K) 

This method is biased for most datasets and parameters tested; 
the exceptions are for Set 3 with the brightest magnitude cut 
{nic - 17 mag) or for jS = 0.5 and 1.0, and for/? = -0.5 and no 
intrinsic scatter (Fig.fTOt. 

The slope in mostly underestimated for all datasets, with the 
bias increasing for brighter m^ for Set 1, and keeping relatively 
stable against varying nic for Sets 2 and 3. For Set 3 the bias 
slightly increases with /, as does the dispersion. This method 
reacts to the number of stars for a given magnitude cut, the bias 
increasing toward fewer stars. Overall, this is not a reliable esti- 
mator of the extinction law. 



3.4.5. Binning in Ay 

This method is formally unbiased for all tested values of/? when 
applied to the three datasets (Fig.fTTTi. However, whereas the bias 
is always close to zero for the datasets without intrinsic scatter, it 
becomes slightly large for most values of /3 when intrinsic scatter 
is introduced. 

The method is robust against variations in the amount of in- 
trinsic scatter within the considered range, and it is stable against 
variations in magnitude cut, except for the brightest value of nic 
for Set 3, where the smaller number of stars and range in Ay 
result in very few bins for the fit. The dispersion increases to- 
ward brighter magnitude cuts and amount of intrinsic scatter. 
Because it is based on binning, this method reacts significantly 
to the number of stars within the same magnitude cut, the bias 
and the dispersion increasing for fewer stars. 

Small variations occur when the size of the bin is varied, with 
the slope being increasingly overestimated for smaller bins, but 
the method continues to be unbiased within the dispersion. 

This method is a reliable estimator of the extinction law. 



3.4.6. BCES method 

This method is highly reliable and unbiased for homoscedastic 
data and for heteroscedastic data without intrinsic scatter (Sets 
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Fig. 7. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, 
and intrinsic scatter for the ordinary least squares (OLS) method. The black, red and blue lines represent the results for the 5000 
reaUzations of Sets 1, 2 and 3, respectively. The error bars correspond to the 1-cr dispersion in the results. 
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Fig. 8. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, 
and intrinsic scatter for the weighed least squares (WLS) method. The black, red and blue lines represent the results for the 5000 
reahzations of Sets 1, 2 and 3, respectively. The error bars correspond to the 1-cr dispersion in the results. 



1 and 2). However, in the presence of intrinsic scatter (Set 3), it 
becomes biased for all input slope values except for/? = 2.1 (Fig. 
[T2] ) for / = 0.15. Since the other parameters were tested using 
an input slope of 1.8 (very close to 2.1), the bias seems small in 
the ific, Nstars and / plots, but we can nevertheless analyze the 
sensitivity of the method to these parameters. For Set 3, the bias 
is mildly sensitive to m^, robust against variations in the number 
of stars, and only minimally sensitive to variations in the amount 
of intrinsic scatter /. 

Since real data will be similar, in essence, to Set 3, we do not 
consider this method a reliable estimator in the specific case of 
the extinction law. 



3.4.7. LinES 

This method is the most unbiased and robust of all presented 
here for homoscedastic or heteroscedastic data, with or without 
intrinsic scatter, for all tested values of the input slope (Fig.[T3]). 
The method is robust against variations in nic and /, num- 
ber of stars in the science field, and number of stars in the con- 
trol field, although the dispersion follows the same tendency as 
before: increases for brighter magnitude cuts, slightly increases 
with /, and decreases with number of science and/or control field 
stars. The bias is larger when there are simultaneously very few 
science and control field stars, and small Ay coverage, but is 
nevertheless better than any of the other methods for the same 



conditions. The dispersion is significantly smaller than the next 
least unbiased method, the binning in Ay, in all cases, granting 
it more precision. 

Since LinES relies on the characterization of the data 
through the properties of a control field, we tested the stabil- 
ity of the method against variations in the number of stars in 
the control field. Given a reasonable number of stars in the sci- 
ence field, the method is robust against variations in the number 
of control field stars. However, if the science field itself does 
not have enough stars or Ay coverage, the bias increases further 
for few control field stars. Invariably, the dispersion increases 
toward fewer control field stars. This method has proven to be 
robust as long as the control field is a good representation of the 
underlying population on the science field, even if containing a 
smaller number of stars. 

The excellent performance of this method while varying all 
relevant parameters validates the LinES method for our case 
study. In general, it will provide accurate results for observa- 
tions of cores with either rich or poor background populations, 
regardless of their spread in spectral types, even for relatively 
shallow observations, as long as there is a reasonable spread in 
extinction and the control field is a good representation of the 
reddened, background population. 
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Fig. 9. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, and 
intrinsic scatter for the symmetrical methods: the bisector method in the top four panels, the geometric mean method in the middle 
panels, and the orthogonal regression method in the bottom panels. The black, red and blue lines represent the results for the 5000 
reaUzations of Sets 1, 2 and 3, respectively. The error bars correspond to the 1-cr dispersion in the results. 
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Fig. 10. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, and 
intrinsic scatter for the binning in {H - K) method. The black, red and blue lines represent the results for the 5000 realizations of 
Sets 1, 2 and 3, respectively. The error bars correspond to the l-cr dispersion in the results. 



Limitations The simulations show that LinES is not reliable for 
distributions that do not cover a wide enough range of extinc- 
tion, if the errors are too large. For reasonable errors, like those 
described for the simulations, LinES starts overestimating the 
slope by more than 10% for ranges in x-color (i.e., the color 
plotted on the x-axis) smaller than 0.25 to 0.45 mag for slopes 
between 0.6 and 3.0, respectively, and underestimating the slope 
by more than 10% for the same ranges in x-color for slopes be- 



tween 0.3 and 0.5. This method should therefore not be appUed 
to data that span less than these values in x-color. 



Error estimation We used the bootstrap method (e.g.. 
Wall & Jenkins 2003) to estimate the uncertainty in the slope de- 
rived by LinES. This method consists in randomly dividing each 
sample in two equal-number subsets, and measuring the slope 
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Fig. 11. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, and 
intrinsic scatter for the binning in Ay method. The black, red and blue lines represent the results for the 5000 realizations of Sets 1, 
2 and 3, respectively. The eiTorbars correspond to the 1-cr dispersion in the results. 
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Fig. 12. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, and 
intrinsic scatter for the BCES method. The black, red and blue lines represent the results for the 5000 realizations of Sets 1, 2 and 
3, respectively. The error bars correspond to the 1-cr dispersion in the results. 
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Fig. 13. From left to right, bias in the slope as a function of the magnitude cut, input slope, number of stars in the science field, and 
intrinsic scatter for the LinES method. The black, red and blue lines represent the results for the 5000 realizations of Sets 1, 2 and 
3, respectively. The error bars correspond to the 1-cr dispersion in the results. 



of the extinction law in each subset. This produces two values 
of yS from which we derive the standard deviation cr^,. This was 
repeated A^ = 1000 times and the uncertainty in the slope was 
defined as: 



c^/J 



V2A^tt 



A' 



(19) 



We applied this method on the synthetic data and compared 
the results with the dispersion era from the simulations. We 
found that the bootstrap uncertainty is typically 80% of era, the 
dispersion from the synthetic data, and the value we believe is 
a better estimate of the actual dispersion expected for real data. 
Given the consistency of the results against the variation of the 
different parameters, we take the uncertainty on the LinES result 
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Fig. 16. Example of a color-color diagram of a synthetic stellar 
population reddened with a reddening vector with slope 1.5 up 
to(H - K) - 1 .0 mag, and with slope 1 .0 for higher (H-K). 



of our case study to be 1 .25 times the uncertainty derived by the 
bootstrap method described above. 

3.5. Results from "real" data 

Figures [T4l and [TSJ show the bias of each method vs. varying in- 
put slope and varying magnitude cut, respectively, for the "real" 
datasets described in Sec. 13.21 Except for the cases described 
below, these results follow those from the synthetic data, main- 
taining the same behavior and changing only the magnitude of 
the bias. The methods that deserve some further consideration 
are the binning in Ay method, the BCES, and LinES. 

The BCES (bottom-middle panel in the figures) applied to 
these datasets shows the same type of behavior as with the syn- 
thetic data against varying magnitude cuts (Fig. [TSl l. The bias 
is slightly larger than before because there are fewer stars and 
more dispersion. For varying input slopes, however, the bias is 
considerably larger, showing that the method is severely biased 
for some ranges of yS when applied to less than ideal data. 

The binning in Ay method, which we concluded was reason- 
ably unbiased when applied to the synthetic data, shows a severe 
bias also for some values of the input slope fi, in particular for 
the disk dataset (bottom-left panel in the figures). However, this 
happens because this dataset has very few stars, and a very lim- 
ited range in Ay (see Figure ID, which translates into too few 
Ay bins to constrain the extinction vector slope adequately. This 
presents a limitation of the method, away from which it can be 
used with relatively good precision. 

In the "real" datasets, LinES continues to perform beauti- 
fully. Except for the first point in Figure [T5](fcoffow-ng/if panel). 
where the magnitude cut is such that limits the Ay to a very nar- 
row range, the bias is consistent with zero for all other magnitude 
cuts and for all values of the input slope we tested. 



4. Detecting a break in the extinction law 

Grain growth at high densities has been proposed by a number 
of studies (see references in Sect. [1). If it does occur, then in 



dense cores the extinction law will become grayer toward higher 
extinctions, which should translate into a variation of the slope 
of the reddening vector with extinction, either smooth or abrupt 
depending on the nature of the transition between grain sizes. 
This break was detected and meas ured in the Trifid Nebula for 
an extinction of Ay — 20 mag bv ICambresv et all (1201 lb . We 
used the synthetic and "real" datasets to test whether we could 
detect such a break in the extinction law using LinES. The data 
was generated using the exact same method as described above 
for Set 3 with/ = 0.15 and for the "real" dataset (see Sect. 13. 1.3] 
13.21) . but the reddening vector was made flatter at some value of 
Ay, or conversely, at some value of (H-K) color, since the color 
scales linearly with Ay. This produced color-color diagrams sim- 
ilar to that of Fig. [16] where the break is more or less obvious 
depending on the difference between the two slopes. 

For each realization, we divided the sample into low- 
extinction ((H - K) less than a value (H - /T)/,,,,,,), and high- 
extinction ((H - K) > (H - K)iii„it), and determined the best 
fits to the reddening vector in the two groups using the LinES 
method, obtaining two slopes ^iok and jS/,,^/, . This was done for 
increasing values of (H - K)ii,„i, in steps of 0.2, and starting at 
(H - K)n„n, = 0.4. 

Figui'es[T7]and[r8]show the behavior of y6/o,i, (black) andy6/„g/, 
(red) as a function of (H - K)iimit for the synthetic dataset 3 and 
the "real" bulge dataset, respectively, when applied to 5000 real- 
izations. The solid lines show the median curves, and the shaded 
regions represent the 1 -cr scatter from the 5000 realizations. The 
upper panels are for a single-slope reddening vector with slopes 
1.0 (left) and 1.5 (right). The bottom panels show the same dis- 
tributions but for broken extinction laws, with slopes of 1.5 at 
low-extinction and 1.0 and high-extinction, and breaks located 
around Ak = 0.4 mag (left) and Ak = 1.5 mag (right). For com- 
pleteness, we find the same results using the disk dataset, albeit 
with a larger dispersion. 

In the absence of a break, the two curves are indistinguish- 
able except for the lowest and highest values of (H - K)ii,„i,\ this 
is because the subsets used for the fits in these two extremes con- 
tain too narrow ranges in (H-K) to constrain the LinES method. 
When a break does exist, however, the two curves separate dis- 
tinguishably, even if the break is at low extinction. This then 
provides a simple method to test whether the same extinction 
law applies to the full Ay range of a given dataset, or if it would 
rather best be described as a two-segment law. Unfortunately this 
method does not allow for the determination of the actual value 
of the break, but the figures show that the slope of the extinc- 
tion vector at high extinction can be determined with reasonable 
accuracy. In particular, the procedure of measuring the slopes in 
the low-Ai/ and high-Ay regimes provides a much better handle 
on the extinction law at high extinction (red line in the figures) 
than measuring the slope of the entire dataset as a whole (blue 
line in the figures). 

5. Summary 

We tested several methods of linear regression associated with 
the problem of measuring the extinction law from photometric 
data. We found that many of the commonly used methods pro- 
vide biased results caused by the presence of errors in both co- 
ordinates (which are colors), by the fact that they are correlated, 
and by the presence of scatter intrinsic to the underlying distri- 
bution. 

We adapted the BCES method of lAkritas & Bershadvl(ll996h 
to allow a compensation for intrinsic scatter, using a control 
field to characterize the background, unreddened population. We 
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Fig. 14. Bias in the slope as a function of the input slope for the bulge (blue line) and disk (red line) datasets, for all the methods 
tested. The error bars coiTespond to the 1-cr dispersion of the 5000 realizations. 



called this method LinES (Linear regression with Errors and 
Scatter). Using synthetic data, we showed this method provides 
unbiased and correct results, and that it is robust against the vari- 
ation of all relevant parameters (at least) within reasonable lim- 
its, such as size of sample, range of extinction, and amount of 
intrinsic scatter. 

We found that dividing any subset in sliding values of Ay and 
measuring the slopes of each subset can robustly differentiate 
between an extinction law characterized by a single slope and 
one with a break. 

These results can be applied to observations of background 
stars seen through dense cores of molecular clouds, or through 
regions that span a reasonable range of dust density. The charac- 
terization of the extinction law through deep, photometric data 
is a very useful tool to probe the properties of the dust grains 
in these regions, and a "cheap" one when compared with, for 
example, spectral analysis of many individual sources. 
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Fig. 15. Bias in the slope as a function of the magnitude cut for the bulge (blue line) and disk {red line) datasets, for all the methods 
tested. The error bars correspond to the 1-cr dispersion of the 5000 realizations. 
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Fig. 17. Plow and Phigh as a function of {H - K)ij,„i, for 5000 realizations of the synthetic dataset 3. The solid lines show the median 
curves, the shaded regions represent the 1 - cr scatter from the 5000 realizations, and the dashed lines mark the input values Piow 
and/Shigi, of the slopes. Upper panels: 0i„w andy6/„g/, for a single-slope reddening vector with slopes 1.0 (left) and 1.5 (right). Bottom 
panels: same distributions but for broken extinction laws, with slopes of 1.5 at low-extinction and 1.0 and high-extinction, and 
breaks located at Ak = 0.4 mag (left) and Ak = 1.5 mag (right). The blue line indicates the average value of the slope if LinES is 
applied to the whole range of Ay. 
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Fig. 18. PioK and yS/,/^/, as a function of (H - K)u„,i, for 5000 realizations of the bulge dataset. The solid lines show the median curves, 
the shaded regions represent the \-cr scatter from the 5000 realizations, and the dashed lines mark the input values y6/oit andy6/„g/, of 
the slopes. Upper panels: 0iow andy6/„g/, for a single-slope reddening vector with slopes 1 .0 (left) and 1 .5 (right). Bottom panels: same 
distributions but for broken extinction laws, with slopes of 1.5 at low-extinction and 1.0 and high-extinction, and breaks located at 
Ak - 0.4 mag (left) and A/f = 1.5 mag (right). The blue line indicates the average value of the slope if LinES is appUed to the whole 
range of Ay. 



16 



